Abstract

Using Trades and Quotes data from the Paris stock market, we show 
that the random walk nature of traded prices results from a very delicate 
interplay between two opposite tendencies: long-range correlated market 
orders that lead to super-diffusion (or persistence), and mean reverting 
limit orders that lead to sub-diffusion (or anti-persistence). We define 
and study a model where the price, at any instant, is the result of the 
impact of all past trades, mediated by a non constant 'propagator' in time 
that describes the response of the market to a single trade. Within this 
model, the market is shown to be, in a precise sense, at a critical point, 
where the price is purely diffusive and the average response function almost 
constant. We find empirically, and discuss theoretically, a fluctuation- 
response relation. We also discuss the fraction of truly informed market 
orders, that correctly anticipate short term moves, and find that it is quite 
small. 
1 Introduction



The Efficient Market Hypothesis (EMH) posits that all available information is 
included in prices, which emerge at all times from the consensus between fully 
rational agents, that would otherwise immediately arbitrage away any deviation 
from the fair price [2]. Price changes can then only be the result of
unanticipated news and are by definition totally unpredictable. The price is at any
instant of time the best predictor of future prices. One of the central predictions 
of EMH is thus that prices should be random walks in time, which (to a good
approximation) they indeed are. This was interpreted early on as a success of
EMH. However, as pointed out by Shiller, the observed volatility of markets
is far too high to be compatible with the idea of fully rational pricing [3]. The 
frantic activity observed in financial markets is another problem: on liquid stocks,
there is typically one trade every 5 seconds, whereas the time lag between
relevant news items is certainly much larger. More fundamentally, the assumption of
rational, perfectly informed agents seems intuitively much too strong, and has 
been criticized by many jH |S| . Even the very concept of the fair price of a 
company appears to be somewhat dubious. 

There is a model at the other extreme of the spectrum where prices also follow 
a pure random walk, but for a totally different reason. Assume that agents, 
instead of being fully rational, have zero intelligence and take random decisions 
to buy or to sell, but that their action is interpreted by all the other agents
as potentially containing some information. Then, the mere fact of buying (or
selling) typically leads to a change of the ask a(t) (or bid b(t)) price and hence
to a change of the midpoint m(t) = [a(t) + b(t)]/2. In the absence of reliable
information about the 'true' price, the new midpoint is immediately adopted 
by all other market participants as the new reference price around which new 
orders are launched. In this case, the midpoint will also follow a random walk (at 
least for sufficiently large times), even if trades are not motivated by any rational 
decision and are devoid of meaningful information.¹ This alternative, random trading
model has been recently the object of intense scrutiny, in particular as a simplified 
approach to the statistics of order books |H1 O HOI HTJ H^l HH1 H31 [T3J . Since the 
order flow is a Poisson process, this assumption is quite convenient and leads to 
tractable analytical models ^J^l- Perhaps surprisingly, many qualitative (and 
sometimes quantitative) properties of order books can be predicted using such 
an extreme postulate [T2J H31 CH HZ] . 

Of course, reality should lie somewhere in the middle: clearly, the price cannot 
wander arbitrarily far from a reasonable value, and trades cannot all be random. 
The interesting question is to know which of the two pictures is closest to reality 
and can be taken as a faithful starting point around which improvements can be
perturbatively added.

¹ That this simplistic model also leads to a random walk behaviour for prices has also very
recently been pointed out in [7].

In this paper, we want to argue, based on a series of detailed empirical results 
obtained on trade by trade data, that the random walk nature of prices is in 
fact highly non trivial and results from a fine-tuned competition between two 
populations of traders: liquidity providers ('market-makers') on the one hand,
and liquidity takers (sometimes called 'informed traders', but see the discussion in
Section 4) on the other. For reasons that we explain in more detail below, liquidity providers
act such as to create anti-persistence (or mean reversion) in price changes that 
would lead to a sub-diffusive behaviour of the price, whereas liquidity takers' 
action leads to long range persistence and super-diffusive behaviour. Both effects 
very precisely compensate and lead to an overall diffusive behaviour, at least to 
a first approximation, such that (statistical) arbitrage opportunities are absent, 
as expected. However, one can spot the vestiges of this subtle compensation
from the temporal structure of the market impact function (which measures how 
a given trade affects on average future prices). 

The organization of this paper is as follows. We first present (Section 2) 
our empirical results on the statistics of trades, market impact and fluctuations. 
We show in particular that the order flow exhibits long range autocorrelations 
in time, but that this does not lead to any predictability in price changes, as 
also recently noticed in [17]. Then, we introduce in Section 3 a simple model
that expresses the price as a linear superposition of the impact of each trade. 
We show that this model allows us to rationalize our empirical findings, provided a
specific relation between the temporal autocorrelation of the sign of the trades 
(i.e. buyer initiated or seller initiated) and the temporal response to a single 
trade is satisfied. Finally, in Section 4, we give intuitive arguments that allow 
one to understand the market forces at the origin of this subtle balance between 
two opposite effects, which dynamically leads to absence of statistical arbitrage 
opportunities. We argue that in a very precise sense, the market is sitting on a 
critical point; the dynamical compensation of two conflicting tendencies is similar 
to other complex systems such as the heart [18], driven by two antagonist systems
(sympathetic and para-sympathetic), or certain human tasks, such as balancing
a long stick [19]. The latter example illustrates very clearly the idea of dynamical
equilibrium, and shows how any small deviation from perfect balance may lead to
strong instabilities. This near instability may well be at the origin of the fat tails
and volatility clustering observed in financial data (see e.g. [20, 21, 22, 23, 24, 25]).
Note that these two features are indeed present in the 'balancing stick' time series
studied in [19].
2 Market impact and fluctuations



2.1 Presentation of the data and definitions 

In this study, we have analyzed trades and quotes data from liquid French stocks
in the years 2001 and 2002, although qualitatively similar results were also
obtained on British stocks. The advantage of the French market, however, is
that it is fully electronic, whereas only part of the volume is traded electronically
on the London Stock Exchange. We will illustrate our results mainly using the
France-Telecom stock, which is one of the most actively traded stocks, for which
statistics are particularly good.

There are two data files for each stock: one gives the list of all successive 
quotes, i.e. the best buy (bid, b) and sell (ask, a) prices, together with the 
available volume, and the time stamp accurate to the second. A quote can change 
either as a result of a trade, or because new limit orders appear, or else because 
some limit orders are canceled. The other data file is the list of all successive 
trades, with the traded price, traded volume and time stamp, again accurate to 
the second. Sometimes, several trades are recorded at the very same instant but 
at different prices: this corresponds to a market order of a size which exceeds 
the available volume at the bid (or at the ask), and hits limit orders deeper in 
the order book. In the following, we have grouped all these trades together as 
a single trade. This allows one to create chronological sequences of trades and 
quotes, such that between any two trades there is at least one quote. 

The last quote before a given trade allows one to define the sign of each trade:
if the traded price is above the last midpoint $m = (a + b)/2$, this means that the
trade was triggered by a market order (or marketable limit order) to buy, and we
assign to that trade a variable $\varepsilon = +1$. If, on the other hand, the traded price
is below the last midpoint $m = (a + b)/2$, then $\varepsilon = -1$. With each trade is also
associated a volume $V$, corresponding to the total number of shares exchanged.
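The sign rule above can be sketched in a few lines. This is only an illustration: the function name `trade_sign` is hypothetical, and returning 0 for a trade exactly at the midpoint is an assumption, since the text does not say how that case is treated.

```python
def trade_sign(trade_price, bid, ask):
    """Classify a trade from the last quote preceding it.

    Returns +1 if the trade price is above the midpoint (buyer initiated),
    -1 if it is below (seller initiated).
    """
    mid = 0.5 * (bid + ask)
    if trade_price > mid:
        return 1
    if trade_price < mid:
        return -1
    # Assumption: trades exactly at the midpoint are left unclassified.
    return 0


print(trade_sign(10.01, 10.00, 10.01))  # trade at the ask: buy
print(trade_sign(10.00, 10.00, 10.01))  # trade at the bid: sell
```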

Trades appear at random times, the statistics of which are themselves non-trivial
(there are intra-day seasonalities and also clustering of the trades in time). We
will not be interested in this aspect of the problem and always reason in terms of
trade time, i.e. time advances by one unit every time a new trade (or a series of
simultaneous trades) is recorded. We have also systematically discarded the first
ten and the last ten minutes of trading in a given day, to remove any artifacts
due to the opening and closing of the market. Many quantities of interest in the
following are two-time observables, that is, they compare two observables at (trade)
times $n$ and $n + \ell$. In order to avoid overnight effects, we have restricted our
analysis to intra-day data, i.e. both $n$ and $n + \ell$ belong to the same trading day.
We have also assumed that our observables only depend on the time lag $\ell$.

In the example of France-Telecom, on which we will mostly focus, there are
on the order of 10 000 trades per day. For example, the total number of trades
on France-Telecom during 2002 was close to $2 \times 10^6$; this allows quite accurate
statistical estimates of various quantities. The volume of each trade was found to
be roughly log-normally distributed, with $\langle \ln V \rangle \simeq 5.5$ and a root mean square
of $\Delta \ln V \simeq 1.8$. The range of observed values of $\ln V$ is between 1 and 11.

2.2 Price fluctuation and diffusion 

The simplest quantity to study is the average mean square fluctuation of the price
between (trade) times $n$ and $n + \ell$. Here, the price $p_n$ is defined as the mid-point
just before the $n$-th trade: $p_n = m_{n^-}$. In this paper, we always consider detrended
prices, such that the empirical drift is zero. We thus define $\mathcal{D}(\ell)$ as:

$$\mathcal{D}(\ell) = \left\langle (p_{n+\ell} - p_n)^2 \right\rangle. \tag{1}$$

As is well known, in the absence of any linear correlations between successive
price changes, $\mathcal{D}(\ell)$ has a strictly diffusive behaviour, i.e.

$$\mathcal{D}(\ell) = D\ell, \tag{2}$$

where $D$ is a constant. In the presence of short-ranged correlations, one expects
deviations from this behaviour at short times. However, on liquid stocks with
relatively small tick sizes such as France-Telecom (FT), one finds a remarkably
linear behaviour for $\mathcal{D}(\ell)$, even for small $\ell$. The absence of linear correlations in
price changes is equivalent to saying that (statistical) arbitrage opportunities are
absent, even for high frequency trading. In fact, in order to emphasize the differences
from a strictly diffusive behaviour, we have studied the quantity $\sqrt{\mathcal{D}(\ell)/\ell}$
(which has the dimension of Euros). We show this quantity in Fig. 1 for FT,
averaged over three different periods: first semester of 2001 (where the tick size
was 0.05 Euros), second semester of 2001, and the whole of 2002 (where the tick
size was 0.01 Euros). One sees that $\mathcal{D}(\ell)/\ell$ is indeed nearly constant, with a
small 'oscillation' on which we will comment later. Similar plots can be observed
for other stocks (see Fig. 2). We have noted that for stocks with larger ticks, a
slow decrease of $\mathcal{D}(\ell)/\ell$ is observed, corresponding to a slight anti-persistence (or
sub-diffusion) effect.
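As an illustration of how $\mathcal{D}(\ell)$ in Eq. (1) might be estimated from one day of midpoints in trade time, here is a minimal sketch. The function name and the toy data are assumptions (a synthetic random walk with steps of one 0.01 Euro tick, not the France-Telecom data), and detrending is omitted because the toy walk has no drift.

```python
import random


def mean_square_fluctuation(prices, lag):
    """Estimate D(lag) = <(p[n+lag] - p[n])^2> over all n of one trading day."""
    diffs = [prices[n + lag] - prices[n] for n in range(len(prices) - lag)]
    return sum(d * d for d in diffs) / len(diffs)


# Toy data (an assumption, not the paper's data): a pure random walk in
# trade time whose steps are plus or minus one tick of 0.01 Euros.
random.seed(0)
toy_prices = [0.0]
for _ in range(20000):
    toy_prices.append(toy_prices[-1] + random.choice((-0.01, 0.01)))

for lag in (1, 10, 100):
    d_over_lag = mean_square_fluctuation(toy_prices, lag) / lag
    # For a diffusive price, sqrt(D(lag)/lag) stays close to the step size.
    print(lag, d_over_lag ** 0.5)
```

For a strictly diffusive series the printed values should remain roughly flat in the lag, which is the behaviour the text describes for small-tick stocks.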

The conclusion is that the random walk (diffusive) behaviour of stock prices
appears even at the trade by trade level, with a diffusion constant $D$ which is
of the order of the typical bid-ask spread squared. From Fig. 1, one indeed sees that
$\sqrt{\mathcal{D}(1)} \simeq 0.01$ Euros, which is precisely the tick size, and FT has a typical bid-ask
spread equal to one or two ticks. This coincidence is interesting. It might
suggest that price changes are to a large extent induced by the trading activity
itself, independently of real news (unless, of course, the news flow is itself on
the scale of seconds and each news item has an impact on the price that is
commensurate with the bid-ask spread). Much stronger arguments in favor of this
point, based on estimates of the fraction of informed trades, will be given below.



Figure 1: Plot of $\sqrt{\mathcal{D}(\ell)/\ell}$ as a function of $\ell$ for France-Telecom, during three
different periods. The variation of $\mathcal{D}(\ell)/\ell$ with $\ell$ is very small, in particular in
the small tick (0.01 Euros) period (July 2001 - December 2002). For the large
tick size period (0.05 Euros; January 2001 - June 2001), there is a systematic
downward trend: see also Fig. 2.

Figure 2: Plot of $\sqrt{\mathcal{D}(\ell)/\ell}$ as a function of $\ell$ for other stocks during the year
2002, except Barclays (May-June 2002). The y-axis has been rescaled arbitrarily
for clarity. We note that stocks with larger tick size tend to reveal a stronger
mean-reverting effect.

This conclusion seems to imply that the price may, in the long run, wander
arbitrarily far from the fundamental price, which would be absurd. However, 
even if one assumes that the fundamental price is independent of time, a typical 
3% noise induced daily volatility would lead to a significant (say a factor 2) 
difference between the traded price and the fundamental price only after a few 
years [2E|. Since the fundamental price of a company is probably difficult to 
determine better than within a factor two, say (see e.g. 0123), one only expects 
fundamental effects to be relevant on very long time scales (as indeed suggested 
by the empirical results of de Bondt and Thaler [28]), and to be totally
negligible on the short (intra-day) time scales of interest here. We will in fact see
below (cf. Eq. (27)) that the reference price that market participants seem to
have in mind is in fact a short time average of the past price itself, rather than 
any fundamental price. 

2.3 Response function and market impact 

In order to better understand the impact of trading on price changes, one can
study the following response function $\mathcal{R}(\ell)$, defined as:

$$\mathcal{R}(\ell) = \left\langle (p_{n+\ell} - p_n) \cdot \varepsilon_n \right\rangle, \tag{3}$$

where $\varepsilon_n$ is the sign of the $n$-th trade, introduced in Section 2.1. The quantity
$\mathcal{R}(\ell)$ measures how much, on average, the price moves up a time $\ell$ after a buy
order (or how much it moves down a time $\ell$ after a sell order). As will be
clear below, this quantity is however not the market response to a single trade, a
quantity that will later be denoted by $G_0$. A more detailed object can in fact be
defined by conditioning the average to a certain volume $V$ of the $n$-th trade:

$$\mathcal{R}(\ell, V) = \left\langle (p_{n+\ell} - p_n) \cdot \varepsilon_n \right\rangle \big|_{V_n = V}. \tag{4}$$
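The definition in Eq. (3) translates directly into an estimator. The sketch below is illustrative only: the function name and the toy data are assumptions, with independent random signs and a permanent shift of 0.01 Euros per trade, so that the response should be flat in the lag.

```python
import random


def response_function(prices, signs, lag):
    """Estimate R(lag) = <(p[n+lag] - p[n]) * eps[n]>.

    prices[n] is the midpoint just before trade n; signs[n] is +1 or -1.
    """
    n_terms = min(len(prices) - lag, len(signs))
    total = sum((prices[n + lag] - prices[n]) * signs[n] for n in range(n_terms))
    return total / n_terms


# Toy check (an assumption, not market data): independent random signs,
# each trade shifting the midpoint permanently by 0.01 Euros.
random.seed(1)
toy_signs = [random.choice((-1, 1)) for _ in range(20000)]
toy_prices = [0.0]
for eps in toy_signs:
    toy_prices.append(toy_prices[-1] + 0.01 * eps)

print(response_function(toy_prices, toy_signs, 1))
print(response_function(toy_prices, toy_signs, 10))
```

Conditioning on volume as in Eq. (4) would amount to restricting the sum to trades whose volume falls in a chosen bin.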

Previous empirical studies have mostly focused on the volume dependence of
$\mathcal{R}(\ell, V)$, and established that this function is strongly concave as a function of
the volume |2H1 HUB EH E3 El- In |3H] . a thorough analysis of U.S. stocks was 
analyzed in terms of a piecewise power-law dependence for $\mathcal{R}(\ell = 1, V) \propto V^\alpha$,
with an exponent $\alpha \simeq 0.4$ for small volumes, and a smaller value ($\alpha \simeq 0.2$)
for larger volumes. In a previous publication [31], some of us have proposed
that this dependence might in fact be logarithmic (see also a footnote in [32]):
$\mathcal{R}(\ell = 1, V) = R_1 \ln V$ (where $R_1$ is a stock dependent constant), a law that
seems to satisfactorily account for all the data that we have analyzed. The
empirical determination of the temporal structure of $\mathcal{R}(\ell, V)$ has been much less
investigated (although one can find in [22] somewhat related results on a
coarse-grained version of $\mathcal{R}(\ell, V)$). Preliminary empirical results, published in [31],
reported that $\mathcal{R}(\ell, V)$ could be written in a factorized form (first suggested on
theoretical grounds in [12]):

$$\mathcal{R}(\ell, V) \approx \mathcal{R}(\ell) f(V); \qquad f(V) \propto \ln V, \tag{5}$$



where $\mathcal{R}(\ell)$ is a slowly varying function that initially increases up to $\ell \sim 100$ to $1000$
and then is seen to decrease back, with a rather small overall range of variation.
The initial increase of $\mathcal{R}(\ell)$ was reported in [29] and has also recently been noticed
by Lillo and Farmer [17]. Here, we provide much better data that supports both
the above assertions. We show for example in Fig. 3 the temporal structure of
$\mathcal{R}(\ell)$ for France-Telecom, for different periods. Note that $\mathcal{R}(\ell)$ increases by a
factor $\sim 2$ between $\ell = 1$ and $\ell = \ell^* \approx 1000$, before decreasing back. Similar
results have been obtained for many different stocks as well: Fig. 4 shows a small
selection of other stocks, where the non-monotonic behaviour of $\mathcal{R}(\ell)$ is shown.
However, in some cases (such as Pechiney), the maximum is not observed. One
possible reason is that the number of daily trades is in this case much smaller
($\sim 1000$), and that $\ell^*$ is beyond the maximum intra-day time lag.

Figure 3: Average response function $\mathcal{R}(\ell)$ for FT as a function of time (in trades),
during three different periods (black symbols). We have given error bars for the
2002 data. For the 2001 data, the y-axis has been rescaled to best collapse onto
the 2002 data. Using the same rescaling factor, we have also shown the data of
Fig. 1. The fact that the same rescaling works approximately for $\mathcal{D}(\ell)$ as well
will be discussed further in Section 2.4 below.

The existence of a time scale $\ell^*$ beyond which $\mathcal{R}(\ell)$ decreases is thus both
statistically significant, and to a large degree independent of the considered stock.
On the other hand, the amplitude of the change of $\mathcal{R}(\ell)$ seems to be stock
dependent. As will be clear later, the slowly varying nature of $\mathcal{R}(\ell)$ and the fact
that this quantity reaches a maximum are non-trivial results that will require a
specific interpretation.

Figure 4: Average response function $\mathcal{R}(\ell)$ for a restricted selection of stocks,
during the year 2002.

Turning now to the factorization property of $\mathcal{R}(\ell, V)$, Eq. (5), we illustrate
its validity in Fig. 5, where $\mathcal{R}(\ell, V)/f(V)$ is plotted as a function of $\ell$ for different
values of $V$. The function $f(V)$ was chosen for best visual rescaling, and is found
to be close to $f(V) = \ln V$, as expected. Note that for the smallest volume
(open circles), the long time behaviour of $\mathcal{R}(\ell, V)$ seems to be different, which is
probably due to the fact that small volumes are in fact more likely to be large
volumes chopped up into small pieces.

One has to keep in mind that the response function $\mathcal{R}(\ell)$ captures a small
systematic effect that relates the average price change to the sign of a trade.
However, the fluctuations around this small signal are large, and increase with $\ell$.
A way to see this is to introduce the random variable $u_\ell = (p_{n+\ell} - p_n) \cdot \varepsilon_n$. By
definition, $\mathcal{R}(\ell)$ is the average of $u_\ell$, and $\mathcal{D}(\ell)$ is the average of $u_\ell^2$. Since $\mathcal{R}(\ell)$ is
roughly constant whereas $\mathcal{D}(\ell)$ grows linearly with $\ell$, one sees that the impact of
a given trade (as measured by $\mathcal{R}(\ell)$) rapidly becomes lost in the fluctuations.
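The loss of signal can be made slightly more explicit. As a sketch, assuming the diffusive form of Eq. (2) and a roughly constant $\mathcal{R}(\ell)$ (the symbol $\sigma_u$ for the standard deviation of $u_\ell$ is introduced here only for illustration), one has for large $\ell$:

```latex
\sigma_u^2(\ell) = \langle u_\ell^2 \rangle - \langle u_\ell \rangle^2
                 = \mathcal{D}(\ell) - \mathcal{R}^2(\ell) \approx D\ell
\quad\Longrightarrow\quad
\frac{\mathcal{R}(\ell)}{\sigma_u(\ell)} \approx \frac{\mathcal{R}(\ell)}{\sqrt{D\ell}} \propto \ell^{-1/2}.
```

The signal-to-noise ratio of a single trade's impact therefore decays as the inverse square root of the lag.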

In Fig. 6, we show the whole empirical distribution $P(u_\ell)$ of $u_\ell$ for $\ell = 128$
(but other values of $\ell$ lead to similar results). This distribution is found to be
only slightly skewed in the direction of positive $u_\ell$. In fact, if one considers the
shifted variable $u_\ell - \nu$, where $\nu = 0.01$ Euros, the distribution becomes nearly
symmetric. Note that 0.01 Euros is equal to half the typical bid-ask spread and
can therefore be seen as the cost of a market order. The Efficient Market picture
suggests that the non zero value of $\langle u_\ell \rangle$ should mostly be due to a small fraction
of informed trades, that correctly anticipate large price changes as a result of
some private information, while most noise induced trades should only change
the price on short time scales, before arbitrageurs set it back to its 'true' value.
In this case, the positive tail of the distribution $P(u_\ell)$ (corresponding to informed
trades) should be much fatter than the negative tail. This asymmetry can in fact
be taken as an objective measure of the fraction of informed trades. However, the
nearly symmetric shape of $P(u_\ell - \nu)$ shown in Fig. 6 means that one can hardly
detect the statistical presence of informed trades that correctly anticipate the sign
of the price change on a short term basis, such as to at least cover their trading
costs.² This result is consistent with the conclusion of other studies, where it
is established that investors 'trade too much' [35], and that the uninformed price
pressure is large [7]. Note that $P(u_\ell)$ as defined above gives an equal weight to
all trades, independently of their volume. We have also considered the volume
weighted $P(u_\ell)$, which leads to the same qualitative conclusion.

Figure 5: Average response function $\mathcal{R}(\ell, V)$, conditioned to a certain volume $V$,
as a function of $\ell$. Data for different $V$'s have been divided by $f(V) \propto \ln V$ such
as to obtain good data collapse. The thick line corresponds to $\mathcal{R}(\ell)$ (unscaled).

Figure 6: Probability distribution $P(u_\ell)$ of the quantity $u_\ell = (p_{n+\ell} - p_n) \cdot \varepsilon_n$ (in
Euros), for $\ell = 128$. The data is again FT during 2002. The negative part of the
distribution has been folded back to positive $u_\ell$ in order to highlight the small
positive skew of the distribution (which is seen to increase slightly with $|u_\ell|$).
The average value $\mathcal{R}(\ell) = \langle u_\ell \rangle$ is shown as the vertical dashed line. The
dashed-dotted line corresponds to the distribution of $u_\ell - \nu$ with $\nu = 0.01$ Euros. This
curve has been shifted upwards for clarity.

The main conclusions of this section are thus that (a) large volumes impact
prices on average much less (in relative terms) than smaller volumes, (b) the
average impact of a given trade (as measured by $\mathcal{R}(\ell)$) increases with time up to
a certain time scale $\ell^*$ beyond which it decreases and (c) the fraction of trades
that correctly anticipate short term moves is small.

2.4 A Fluctuation-Response relation

In the study of Brownian particles, a very important result that dates back to 
Einstein relates the diffusion coefficient D to the response of the particle to an ex- 
ternal force. That a similar relation might also hold in financial markets was first 
suggested by Rosenow [36], and substantiated there by some empirical results.
We have performed an analysis related to, but different from that of Rosenow. 
For any given trading day, one can compute the average local diffusion constant 
$\mathcal{D}(\ell)$ over a given time scale, say $\ell = 128$, and the average local price response
$\mathcal{R}(\ell)$ over the same time scale. Rosenow, on the other hand, computes a
'susceptibility' as the slope of the average price change over a given time interval versus
the volume imbalance during the same time interval (see [H2]), and relates this 
susceptibility to the diffusion constant. The analogue of Rosenow's result [36]
(which was motivated by a Langevin equation for price variations - see [HI]), is 
a linear relation between $\mathcal{R}^2(\ell)$ and $\mathcal{D}(\ell)$, which we illustrate in Fig. 7 for FT,
for two different periods (first semester of 2001, and 2002). A similar result can 
also be read from Fig. 3. As will be clear in the following, such a relation will 
appear naturally within the simple model that we introduce in Section 3. 

2.5 Long term correlation of trade signs 

All the above results are compatible with a 'zero intelligence' picture of financial 
markets, where each trade is random in sign and shifts the price permanently, 
because all other participants update their evaluation of the stock price as a 

² Some of these trades might of course be profitable in the long run. But since the price
process is nearly diffusive and the numbers of buy and sell market orders are nearly equal,
it is clear that the difference between the fraction of profitable trades on any given time
scale and 50% is small.

Figure 7: Average diffusion constant $D = \mathcal{D}(\ell)/\ell$, computed for $\ell = 128$, and
conditioned to a certain value of $\mathcal{R}^2(\ell)$, also computed for $\ell = 128$ (FT). The
open symbols correspond to 2002, whereas the black symbols are computed using
the first semester of 2001, where the tick size was 5 times larger. Correspondingly,
the x-axis was rescaled down by a factor 25 and the y-axis by a factor five for
this data set.

function of the last trade. As shown in [§1 HUJ C21 ECU UH EES]- a model of the 
order book based on a purely random order flow indeed allows one to go quite 
far in the quantitative understanding of financial markets. In this context, the 
concave shape of the impact as a function of the volume can be understood as 
an order book effect, where the average size of the queue increases with depth. 3 

This model of a totally random stock market is however qualitatively incorrect
for the following reason. Although, as mentioned above, the statistics of price
changes reveal very little temporal correlation, the correlation function of the
sign $\varepsilon_n$ of the trades, on the other hand, reveals very slowly decaying correlations.
This correlation has been mentioned in some papers before, see e.g. [7]. Here,
we propose that these correlations decay as a power-law of the time lag.

More precisely, one can consider the following correlation function:

$$C_0(\ell) = \langle \varepsilon_{n+\ell}\, \varepsilon_n \rangle - \langle \varepsilon_n \rangle^2. \tag{6}$$
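A direct estimator of this correlation function can be sketched as follows. The function name is hypothetical, and the toy input (strictly alternating signs) is chosen only because its correlation is known exactly; it is not meant to resemble market data.

```python
def sign_autocorrelation(signs, lag):
    """Estimate C_0(lag) = <eps[n+lag] * eps[n]> - <eps[n]>^2."""
    n_pairs = len(signs) - lag
    mean_sign = sum(signs) / len(signs)
    mean_product = sum(signs[n + lag] * signs[n] for n in range(n_pairs)) / n_pairs
    return mean_product - mean_sign ** 2


# Toy input with a known answer: perfectly alternating buys and sells.
alternating = [1, -1] * 100
print(sign_autocorrelation(alternating, 1))  # -1.0
print(sign_autocorrelation(alternating, 2))  # 1.0
```

On real data the sum would be restricted to pairs of trades within the same day, as stated in Section 2.1.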

If trades were random, one should observe that $C_0(\ell)$ decays to zero beyond a few
trades. Surprisingly, this is not what happens: on the contrary, $C_0(\ell)$ is strong

3 However, other effects are probably important to understand this concavity, such as the 
conditioning of large market orders to the size of the order book - see [SI 13 ■ 
and decays very slowly toward zero, as an inverse power-law of $\ell$ (see Fig. 8):

$$C_0(\ell) \simeq \frac{C_0}{\ell^\gamma}, \qquad (\ell \geq 1). \tag{7}$$

The value of $\gamma$ seems to be somewhat stock dependent. For example, for FT, one
finds $\gamma \approx 1/5$, whereas for Total $\gamma \approx 2/3$. In their study, Lillo and Farmer found
a somewhat larger value of $\gamma \approx 1/2$ for Vodafone [17]. In any case, the value of $\gamma$
is found to be smaller than one, which is very important because the integral of
$C_0(\ell)$ is then divergent. Now, as will be shown more precisely in the next section,
the integral of $C_0(\ell)$ can intuitively be thought of as the effective number $N_e$ of
correlated successive trades. Hence, out of - say - 1000 trades, one should group
together

$$N_e \simeq 1 + \sum_{\ell=1}^{1000} C_0(\ell) \approx 1 + \frac{C_0}{1-\gamma}\, 1000^{1-\gamma} \tag{8}$$

'coherent' trades. For FT, $\gamma \approx 1/5$ and $C_0 \approx 0.2$, which means that the effect
of one trade should be amplified, through the correlations, by a factor $N_e \sim 50$!
In other words, both the response function $\mathcal{R}$ and the diffusion constant should
increase by a factor 50 between $\ell = 1$ and $\ell = 1000$, in stark contrast with the
observed empirical data. This is the main puzzle that one should try to elucidate:
how can one reconcile the strong, slowly decaying correlations in the sign of the
trades with the nearly diffusive nature of the price fluctuations, and the nearly
structureless response function?
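The order of magnitude of $N_e$ can be checked numerically. The sketch below evaluates both the discrete sum and the closed-form estimate of Eq. (8) with the FT values quoted in the text ($C_0 = 0.2$, $\gamma = 0.2$); the function names are illustrative.

```python
def effective_trades_sum(c0, gamma, horizon):
    """N_e = 1 + sum_{lag=1}^{horizon} c0 * lag^(-gamma), the discrete sum in Eq. (8)."""
    return 1.0 + sum(c0 * lag ** (-gamma) for lag in range(1, horizon + 1))


def effective_trades_estimate(c0, gamma, horizon):
    """Closed-form estimate 1 + c0 / (1 - gamma) * horizon^(1 - gamma) from Eq. (8)."""
    return 1.0 + c0 / (1.0 - gamma) * horizon ** (1.0 - gamma)


n_sum = effective_trades_sum(0.2, 0.2, 1000)
n_est = effective_trades_estimate(0.2, 0.2, 1000)
print(round(n_sum, 1), round(n_est, 1))
```

Both expressions come out in the low sixties, the same order of magnitude as the factor of about 50 quoted in the text.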

Before presenting a mathematical transcription of the above question and
proposing a possible resolution, let us comment on two related correlation
functions that will naturally appear in the following, namely:

$$C_1(\ell) = \langle \varepsilon_{n+\ell}\, \varepsilon_n \ln V_n \rangle, \tag{9}$$

and

$$C_2(\ell) = \langle \varepsilon_{n+\ell} \ln V_{n+\ell}\, \varepsilon_n \ln V_n \rangle. \tag{10}$$

We have found empirically that these two 'mixed' correlation functions are
proportional to $C_0(\ell)$ (see Fig. 8):

$$C_1(\ell) \approx \langle \ln V \rangle\, C_0(\ell); \qquad C_2(\ell) \approx \langle \ln V \rangle^2\, C_0(\ell). \tag{11}$$

There are however small systematic deviations, which indicate that (i) small
volumes contribute more to the long range correlations than larger volumes and
(ii) $\ln V - \langle \ln V \rangle$ is a quantity exhibiting long range correlations as well.

3 A micro-model of price fluctuations 
3.1 Set up of the model 

Figure 8: Volume weighted sign autocorrelation functions as a function of time
lag: $C_0$, $C_1$, $C_2$ (see text for definitions). The straight line corresponds to $\ell^{-\gamma}$ with
$\gamma = 1/5$. The dotted lines correspond to the simple approximation given by Eq. (11).

In order to understand the above results, we will postulate the following trade
superposition model, where the price at time $n$ is written as a sum over all past
trades, of the impact of one given trade propagated up to time $n$:

$$p_n = \sum_{n' < n} G_0(n - n')\, \varepsilon_{n'} \ln V_{n'} + \sum_{n' < n} \eta_{n'}, \tag{12}$$

where $G_0(\cdot)$ is the 'bare' impact function (or propagator) of a single trade, that we
assume to be a fixed, non random function that only depends on time differences.
The $\eta_n$ are also random variables, assumed to be independent from the $\varepsilon_n$, and
model all sources of price changes not described by the direct impact of the
trades: the bid-ask can change as the result of some news, or of some order
flow, in the absence of any trades. We will in the following assume that the $\eta_n$
are also uncorrelated in time, although this assumption can easily be relaxed.
In the above model, we assume that the 'bare' impact function $G_0$ is not itself
fluctuating, which can only be an approximation.
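To make the structure of Eq. (12) concrete, the following sketch builds a price series from given signs, log-volumes and noise terms. It is an illustration under stated assumptions: the function name is hypothetical, trades are indexed from 0 (so the infinite past of the model is truncated), and the $1/\ell$ propagator in the second example is an arbitrary decaying choice, not the form advocated in the paper.

```python
def superposition_prices(signs, log_volumes, noise, propagator):
    """Midpoints p_0..p_N built as in Eq. (12), with trades indexed from 0.

    p_n = sum_{m<n} G0(n - m) * eps_m * lnV_m + sum_{m<n} eta_m
    """
    prices = []
    noise_sum = 0.0
    for n in range(len(signs) + 1):
        impact = sum(
            propagator(n - m) * signs[m] * log_volumes[m] for m in range(n)
        )
        prices.append(impact + noise_sum)
        if n < len(signs):
            noise_sum += noise[n]  # eta_n only enters prices after trade n
    return prices


signs = [1, -1, 1, 1]
log_volumes = [1.0, 1.0, 1.0, 1.0]
noise = [0.0, 0.0, 0.0, 0.0]

# Permanent impact: every trade shifts the price by 0.01 for ever.
permanent = superposition_prices(signs, log_volumes, noise, lambda lag: 0.01)
# Hypothetical decaying impact: the effect of a trade fades as 1/lag.
decaying = superposition_prices(signs, log_volumes, noise, lambda lag: 0.01 / lag)
print(permanent)
print(decaying)
```

The double loop costs O(N^2) operations, which is acceptable for short illustrative series but would need a convolution for a full trading day.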

The bare impact function $G_0(\ell)$ represents by definition the average impact
of a single trade after $\ell$ trades. It could in principle be measured empirically by
launching on the market a sequence of real trades of totally random signs, and
averaging the impact over this sample of trades (a potentially costly experiment!).⁴
As will be clear below, the difference between the quantity $\mathcal{R}(\ell)$ introduced in
the previous Section and $G_0(\ell)$ in fact comes from the strong autocorrelation of
the sign of the trades. In order to understand the temporal structure of $G_0(\ell)$,
note that a single trade first impacts the midpoint by changing the bid (or the
ask). But then the subsequent limit order flow due to that particular trade might
either center on average around the new midpoint (in which case $G_0(\ell)$ would be
constant), or, as we will argue below, tend to mean revert toward the previous
midpoint (in which case $G_0(\ell)$ decays with $\ell$). As discussed below (see Eq. ),
the asymptotic behaviour of the bare impact function in fact reveals the average
cost of a single market order: if $G_0(\ell \gg 1)/G_0(1)$ is small, the cost is large since
the initial impact of the trade is only temporary, and is not followed by a true
long term change of the price.

⁴ However, following this procedure might induce 'copy-cat' trades and still lead to a
difference between the measured response function and $G_0$.

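Using this representation, the price increment between an arbitrarily chosen
initial time 0 and time $\ell$ is:

$$p_\ell - p_0 = \sum_{0 \le n < \ell} G_0(\ell - n)\, \varepsilon_n \ln V_n + \sum_{n < 0} \left[ G_0(\ell - n) - G_0(-n) \right] \varepsilon_n \ln V_n + \sum_{0 \le n < \ell} \eta_n. \tag{13}$$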

If the signs $\varepsilon_n$ were independent random variables, both the response function and the diffusion would be very easy to compute. For example, one would have: 5

$$\mathcal{R}_t(\ell) = \langle \ln V \rangle\, G_0(\ell), \tag{14}$$

i.e. the observed impact function and the bare response function would be proportional. Similarly, one would have:

$$\mathcal{D}_t(\ell) = \langle \ln^2 V \rangle \left( \sum_{0 < n \le \ell} G_0^2(n) + \sum_{n>0} \left[G_0(\ell + n) - G_0(n)\right]^2 \right) + D_\eta\, \ell, \tag{15}$$

where $D_\eta$ is the variance of the $\eta$'s. In the simplest case of a constant bare impact function, $G_0(\ell) = \Gamma_0$ for all $\ell > 0$, one then finds a pure diffusive behaviour, as expected:

$$\mathcal{D}_t(\ell) = \ell \left[ \langle \ln^2 V \rangle\, \Gamma_0^2 + D_\eta \right]. \tag{16}$$

This result (no correlations between the $\varepsilon$'s and a constant bare impact function) corresponds to the simplest possible zero intelligence market. However, we have seen that in fact the $\varepsilon$'s have long range correlations. In this case, the average response function reads:

$$\mathcal{R}_t(\ell) = \langle \ln V \rangle\, G_0(\ell) + \sum_{0 < n < \ell} G_0(\ell - n)\, C_1(n) + \sum_{n>0} \left[G_0(\ell + n) - G_0(n)\right] C_1(n). \tag{17}$$

Note in passing that our trade superposition model, Eq. (12), together with Eq. (11), leads to the factorization property mentioned above (see Fig. 5):

$$\mathcal{R}_t(\ell, V) = \frac{\ln V}{\langle \ln V \rangle}\, \mathcal{R}_t(\ell). \tag{18}$$



5 In the following, we will use the subscript 't' to denote the theoretical expressions for the response function or diffusion.

Now, one sees more formally the paradox discussed in the previous Section: assuming that the impact of each trade is permanent, i.e. $G_0(\ell) = \Gamma_0$, leads to:

$$\mathcal{R}_t(\ell) = \Gamma_0 \left[ \langle \ln V \rangle + \sum_{0 < n < \ell} C_1(n) \right]. \tag{19}$$



If $C_1(n)$ decays as a power-law with an exponent $\gamma < 1$, then the average impact $\mathcal{R}(\ell)$ should grow like $\ell^{1-\gamma}$, and therefore be amplified by a very large factor as $\ell$ increases, at variance with empirical data. The only way out of this conundrum is (within the proposed model) that the bare impact function $G_0(\ell)$ itself should decay with time, in such a way as to offset the amplification effect due to the trade correlations.
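
To get a feel for the size of this amplification, Eq. (19) can be evaluated directly. The sketch below is only illustrative: it assumes a pure power law $C_1(n) = \langle \ln V \rangle\, C_0\, n^{-\gamma}$ for $n \ge 1$, and the defaults `gamma=0.2`, `c0=0.2`, `mean_ln_v=1.0` and `gamma0=1.0` are placeholder values, not fitted parameters. The function name is ours.

```python
def permanent_impact_response(ell, gamma=0.2, c0=0.2, mean_ln_v=1.0, gamma0=1.0):
    """Eq. (19): R_t(ell) = Gamma_0 * [<ln V> + sum_{0<n<ell} C_1(n)].

    Assumes C_1(n) = <ln V> * C_0 * n**(-gamma); all defaults are illustrative.
    """
    correlation_sum = sum(mean_ln_v * c0 * n ** (-gamma) for n in range(1, ell))
    return gamma0 * (mean_ln_v + correlation_sum)


# Amplification of the response between lag 1 and lag 1000.
amplification = permanent_impact_response(1000) / permanent_impact_response(1)

# Growth over a factor 4 in lag, to be compared with 4**(1 - gamma).
growth = permanent_impact_response(4000) / permanent_impact_response(1000)
```

With these placeholder values the response at $\ell = 1000$ comes out several tens of times larger than at $\ell = 1$, and it keeps growing roughly as $\ell^{1-\gamma}$.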

3.2 A relation between the bare propagator and the sign 
correlation function 

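In order to get some guidance, let us now look at the general formula for the diffusion. After a few lines of calculations, one finds:

$$\mathcal{D}_t(\ell) = \langle \ln^2 V \rangle \left[ \sum_{0 \le n < \ell} G_0^2(\ell - n) + \sum_{n>0} \left[G_0(\ell + n) - G_0(n)\right]^2 \right] + 2\Delta(\ell) + D_\eta\, \ell, \tag{20}$$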



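where $\Delta(\ell)$ is the correlation induced contribution:

$$\begin{aligned}
\Delta(\ell) ={}& \sum_{0 \le n < n' < \ell} G_0(\ell - n)\, G_0(\ell - n')\, C_2(n' - n) \\
&+ \sum_{0 < n < n'} \left[G_0(\ell + n) - G_0(n)\right] \left[G_0(\ell + n') - G_0(n')\right] C_2(n' - n) \\
&+ \sum_{0 \le n < \ell}\ \sum_{n' > 0} G_0(\ell - n) \left[G_0(\ell + n') - G_0(n')\right] C_2(n' + n).
\end{aligned} \tag{21}$$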



The constraint from empirical data is that this expression must be approximately linear in $\ell$. As shown in the Appendix, the requirement that $\mathcal{D}_t(\ell)$ is strictly linear in $\ell$ for all $\ell$ in fact allows one to express $G_0(\ell)$ as a function of $C_2(\ell)$. Here, we present a simple asymptotic argument. If we make the ansatz that the bare impact function $G_0(\ell)$ also decays as a power-law:

$$G_0(\ell) = \frac{\Gamma_0\, \ell_0^{\beta}}{(\ell_0 + \ell)^{\beta}} \qquad (\ell \ge 1), \tag{22}$$



then one can estimate $\mathcal{D}_t(\ell)$ in the large $\ell$ limit. When $\gamma < 1$, one again finds that the correlation induced term $\Delta(\ell)$ is dominant, and all three terms scale as $\ell^{2 - 2\beta - \gamma}$, provided $\beta < 1$. In other words, the Hurst exponent of price changes is given by $2H = 2 - 2\beta - \gamma$. Therefore, the condition that the fluctuations are diffusive at long times ($H = 1/2$) imposes a relation between the decay of the sign autocorrelation $\gamma$ and the decay of the bare impact function $\beta$ that reads:

$$2\beta + \gamma = 1 \quad \longrightarrow \quad \beta_c = \frac{1 - \gamma}{2}. \tag{23}$$



For $\beta > \beta_c$, the price is sub-diffusive ($H < 1/2$), which means that price changes show anti-persistence; while for $\beta < \beta_c$, the price is super-diffusive ($H > 1/2$), i.e. price changes are persistent. For FT, $\gamma \approx 1/5$ and therefore $\beta_c \approx 2/5$.

As shown in the Appendix, one can in fact obtain an exact relation between $G_0(\ell)$ and $C_2(\ell)$ if one assumes that price changes are strictly uncorrelated (i.e. that $\mathcal{D}(\ell)$ is linear in $\ell$ for all $\ell$). The asymptotic analysis of this relation leads, not surprisingly, to the same exponent relation $\beta_c = (1 - \gamma)/2$ as above.
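
The exponent relations of this subsection are simple enough to tabulate. The helper names below are ours, and the formulas are only the asymptotic ones quoted in the text (they assume $\gamma < 1$ and $\beta < 1$).

```python
def hurst_exponent(beta, gamma):
    """Asymptotic Hurst exponent of the price, from 2H = 2 - 2*beta - gamma."""
    return 1.0 - beta - 0.5 * gamma


def critical_beta(gamma):
    """Eq. (23): the value of beta for which H = 1/2."""
    return 0.5 * (1.0 - gamma)


for gamma in (0.20, 0.24):
    beta_c = critical_beta(gamma)
    h = hurst_exponent(beta_c, gamma)
    print(f"gamma={gamma:.2f}  beta_c={beta_c:.2f}  H(beta_c)={h:.2f}")
```

For $\gamma = 1/5$ this gives $\beta_c = 2/5$ as stated above; any $\beta$ below $\beta_c$ gives $H > 1/2$ and any $\beta$ above it gives $H < 1/2$.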

At this stage, there seems still to be a contradiction with empirical data, for if one goes back to the response function given by Eq. (17), one finds that whenever $\beta + \gamma < 1$ (which is indeed the case for $\beta = \beta_c$ and $\gamma < 1$), the dominant contribution to $\mathcal{R}_t(\ell)$ should behave as $\ell^{1 - \beta - \gamma}$ and thus grow with $\ell$. For example, for $\gamma \approx 1/5$ and $\beta \approx 2/5$, one should find that $\mathcal{R}_t(\ell) \propto \ell^{2/5}$, which is incompatible with the empirical data of Figs. 3 and 4. But the surprise comes from the numerical prefactor of this power law. One finds, for large $\ell$:



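$$\mathcal{R}_t(\ell) \simeq \langle \ln V \rangle\, \Gamma_0\, C_0\, \frac{\Gamma(1 - \gamma)}{\Gamma(\beta)\, \Gamma(2 - \beta - \gamma)} \left[ \frac{\pi}{\sin \pi\beta} - \frac{\pi}{\sin \pi(1 - \beta - \gamma)} \right] \ell^{1 - \beta - \gamma}. \tag{24}$$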



Therefore, only when $\beta = \beta_c$ is the prefactor exactly zero, which leads to the possibility of a nearly constant impact function! For faster decaying impact functions (larger $\beta$'s), this prefactor is negative, whereas for more slowly decaying impact functions this prefactor is positive. 6 Interestingly, even if the bare response function $G_0(\ell)$ is positive for all $\ell$, the average response $\mathcal{R}_t(\ell)$ can become negative for large enough $\beta$'s, as a consequence of the correlations between trades.
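
The sign change of the prefactor can be checked in a few lines. The sketch below takes the large-$\ell$ amplitude of Eq. (24) to be $\langle \ln V \rangle \Gamma_0 C_0\, \Gamma(1-\gamma) / [\Gamma(\beta)\Gamma(2-\beta-\gamma)]$ times the bracket $\pi/\sin \pi\beta - \pi/\sin \pi(1-\beta-\gamma)$, which is our reading of that equation; the defaults `c0=0.2`, `mean_ln_v=1.0` and `gamma0=1.0` are placeholders, and the function name is ours. It requires $\beta + \gamma < 1$.

```python
import math


def response_prefactor(beta, gamma, c0=0.2, mean_ln_v=1.0, gamma0=1.0):
    """Amplitude multiplying ell**(1 - beta - gamma) in the large-ell response.

    Assumed form (our reading of Eq. (24)); valid for 0 < beta and beta + gamma < 1.
    """
    amplitude = (mean_ln_v * gamma0 * c0 * math.gamma(1.0 - gamma)
                 / (math.gamma(beta) * math.gamma(2.0 - beta - gamma)))
    bracket = (math.pi / math.sin(math.pi * beta)
               - math.pi / math.sin(math.pi * (1.0 - beta - gamma)))
    return amplitude * bracket


gamma = 0.24
beta_c = 0.5 * (1.0 - gamma)
for beta in (beta_c - 0.04, beta_c, beta_c + 0.04):
    print(f"beta={beta:.2f}  prefactor={response_prefactor(beta, gamma):+.4f}")
```

At $\beta = \beta_c$ the two terms of the bracket coincide, because $1 - \beta_c - \gamma = \beta_c$, so the prefactor vanishes up to rounding error.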



3.3 Fitting the average response function 

Since the dominant term is zero for the 'critical' case $\beta = \beta_c$, and since we are interested in the whole function $\mathcal{R}_t(\ell)$ (including the small $\ell$ regime), we have computed $\mathcal{R}_t(\ell)$ numerically, by performing the discrete sum Eq. (17) exactly, and fitted it to the empirical response $\mathcal{R}$. The results are shown in Fig. 9. We have fixed the parameters $\gamma$ and $C_0$ to the values extracted from the behaviour of $C_1(\ell)$ (see Fig. 8): $\gamma = 0.24$ and $C_0 = 0.20$. The overall scaling parameter $\Gamma_0$ is adjusted to $\Gamma_0 = 2.8 \times 10^{-3}$ Euros to match the value of $\mathcal{R}(\ell = 1)$. The values of $\beta$ and $\ell_0$ are fitting parameters: we show in Fig. 9 the response function computed for different values of $\beta$ in the vicinity of $\beta_c = 0.38$, and used $\ell_0 = 20$.

6 Note that although this prefactor increases (in absolute value) with $\beta$ for $\beta > \beta_c$, the power of $\ell$ decreases, which means that for large $\ell$ the amplitude of $\mathcal{R}_t(\ell)$ decreases with $\beta$, as intuitively expected.

Figure 9: Theoretical impact function $\mathcal{R}_t(\ell)$, from Eq. (17), for different values of $\beta$ close to $\beta_c = 0.38$. The shape of the empirical response function can be quite accurately reproduced using $\beta = 0.42$. The only remaining free parameter is $\ell_0 = 20$. The thick plain line is $\mathcal{R}_t(\ell)$ computed using the 'pure diffusion' propagator $G_0^*$ determined in the Appendix, Eq. (34).

The results are compared with the empirical data for FT, showing that one can indeed satisfactorily reproduce, when $\beta \approx \beta_c$, a weakly increasing impact function that reaches a maximum and then decays. One also sees, from Fig. 9, that the relation between $\beta$ and $\gamma$ must be quite accurately satisfied, otherwise the response function shows a distinct upward trend (for $\beta < \beta_c$) or a downward trend ($\beta > \beta_c$). 7 In fact, we have tried other simple forms for $G_0(\ell)$, such as a simple exponential decay toward a possibly non zero asymptotic value, but this leads to unacceptable shapes for $\mathcal{R}(\ell)$.
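
The discrete sum of Eq. (17) is straightforward to evaluate. The sketch below is not the code used for the fit: it assumes the power-law ansatz of Eq. (22) for $G_0$, a pure power law $C_1(n) = \langle \ln V \rangle\, C_0\, n^{-\gamma}$, units in which $\Gamma_0 = \langle \ln V \rangle = 1$, and it truncates the infinite sum over $n > 0$ at a hypothetical cutoff `n_max`, which limits the accuracy at large $\ell$. All function and parameter names are ours.

```python
def bare_propagator(ell, beta, ell0=20.0, gamma0=1.0):
    """Eq. (22): G_0(ell) = Gamma_0 * ell0**beta / (ell0 + ell)**beta."""
    return gamma0 * (ell0 / (ell0 + ell)) ** beta


def sign_volume_correlation(n, gamma=0.24, c0=0.20, mean_ln_v=1.0):
    """Assumed form C_1(n) = <ln V> * C_0 / n**gamma, for n >= 1."""
    return mean_ln_v * c0 * n ** (-gamma)


def theoretical_response(ell, beta, gamma=0.24, c0=0.20, ell0=20.0,
                         gamma0=1.0, mean_ln_v=1.0, n_max=20000):
    """Eq. (17), with the sum over n > 0 truncated at n_max (an assumption)."""
    def g(x):
        return bare_propagator(x, beta, ell0, gamma0)

    def c1(n):
        return sign_volume_correlation(n, gamma, c0, mean_ln_v)

    # Direct impact of the trade at time 0.
    direct = mean_ln_v * g(ell)
    # Correlated trades between time 0 and time ell.
    recent = sum(g(ell - n) * c1(n) for n in range(1, ell))
    # Relaxation of the impact of correlated trades made before time 0.
    older = sum((g(ell + n) - g(n)) * c1(n) for n in range(1, n_max + 1))
    return direct + recent + older


for beta in (0.34, 0.38, 0.42):
    curve = [theoretical_response(ell, beta) for ell in (1, 10, 100)]
    print(beta, [round(value, 3) for value in curve])
```

Two limits provide a check of the implementation: with `c0=0` the function reduces to Eq. (14), and with `beta=0` (constant propagator) it reduces to Eq. (19).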

It is also interesting to use the propagator $G_0^*$ determined in the Appendix from the assumption of a purely diffusive price process for all $\ell$'s. This propagator is plotted in Fig. 10, and compared to the $G_0$ determined above from the fit of $\mathcal{R}(\ell)$. As shown in Fig. 9, the use of $G_0^*$ does not lead to a very good fit of $\mathcal{R}(\ell)$. Since the latter quantity is in fact very sensitive to the chosen shape for $G_0$, it does reveal small, but systematic deviations from a purely diffusive price process. [Note that if one had = the resulting $\mathcal{R}(\ell)$ should be strictly constant.]

7 This might actually explain the different behaviour of Pechiney seen in Fig. 4.

Figure 10: Shape of the bare propagator $G_0$, determined either by the fit of $\mathcal{R}$, with $\beta = 0.42$ and $\ell_0 = 20$, or using the exact relation, Eq. (34), derived in the Appendix from the assumption of a purely diffusive process.

3.4 Back to the diffusion constant 

As we showed above, the reason for the fine tuning of $\beta$ is the requirement that price changes are almost diffusive. We can therefore also compute $\mathcal{D}_t(\ell)$ for all values of $\ell$ using the very same values of $\gamma$, $\beta$, $C_0$, $\ell_0$ and $\Gamma_0$. Now, in order to fit the data one has two extra free parameters: one is $D_\eta$, and the other comes about because the mid-point can change without any trade. One should thus add to $\mathcal{D}_t(\ell)$ an $\ell$-independent 'error' term $D_0$ that survives in the $\ell = 0$ limit, and is associated to bid-ask fluctuations. With these two extra parameters, one can reproduce the empirical determination of $\mathcal{D}(\ell)/\ell$ (see Fig. 11). The small deviations of this quantity from a horizontal line at finite $\ell$ are due to the difference between $G_0$ and $G_0^*$ and/or to the possible autocorrelations between the $\eta_n$ variables, which we have neglected here. Note that the contribution of the term $D_\eta$ turns out to be a factor two larger than that of the impact contribution, Eq. (20), which means that the small increase of the 'impact contribution' with $\ell$ (lower graph of Fig. 11) is hardly detectable in $\mathcal{D}(\ell)/\ell$.

Coming back to the Fluctuation-Response relation discussed in Section 2.4, we see that our model predicts, for $\ell \gg 1$ where the effect of $D_0$ can be neglected:

$$\frac{\mathcal{D}_t(\ell)}{\ell} = Z\, \langle \ln V \rangle^2\, C_0\, \Gamma_0^2 + D_\eta, \qquad \mathcal{R}_t(\ell) = Z'\, \langle \ln V \rangle\, \Gamma_0\, C_0, \tag{25}$$

where $Z$, $Z'$ are numerical constants. Assuming that from one day to the next both the average (log-)traded volume and the impact $\Gamma_0$ of each individual trade might change, while $C_0$ is fixed, immediately leads to the affine relation between $\mathcal{D}$ and $\mathcal{R}^2$ reported in Section 2.4.

Figure 11: Diffusion constant $\mathcal{D}(\ell)/\ell$, using Eq. (20), with the values of $\gamma$, $\beta$, $C_0$, $\ell_0$ and $\Gamma_0$ determined from $\mathcal{R}(\ell)$. Two extra parameters were used: $D_\eta = 10^{-4}$ and $D_0 = 6.6$ (both in Euro squared). The lower graph is the 'impact contribution' to $\mathcal{D}_t(\ell)$, given by Eq. (20) with $D_\eta = 0$. The 'oscillations' at long times are a numerical artefact.

3.5 Discussion 

The conclusion of this Section is that our 'micro-model' of prices, Eq. (12), can be used as a theoretical canvas to rationalize and interpret the empirical results found in the previous Section. Most surprising is the constraint that the empirical results impose on the shape of the 'bare' response function $G_0$, which is found to be a slowly decaying power law which must precisely cancel the slowly decaying autocorrelation of the trades, but reveals systematic deviations from a pure diffusion process, hardly noticeable on the diffusion constant itself. The fact that the bare impact function decays with time (at least on intra-day time scales), in a finely tuned way to compensate the long memory in the trades, is the central result of this paper. This effect is lost in the zero intelligence models of Poissonian order flows, where, after decreasing during a short transient, the impact of each trade becomes permanent: $G_0(\ell) \to G_\infty > 0$. (On this point, see the model studied in ^3], where it is shown that prices are sub-diffusive on time scales shorter than the life-time of limit orders, essentially as a consequence of the shape of the order book.) In fact, both the long time memory of the trades and the slowly relaxing impact function reported here must be the consequence of the strategic behaviour of market participants, which we discuss below in order to get an intuitive understanding of the mechanisms at play.

Although our detailed analysis concerns FT, it is clear that our conclusions are more general, since the strong autocorrelations in the trade signs, the near constancy of the average response function and the diffusive nature of price changes have all been observed on all stocks, with only quantitative changes (see Figs 2 and 4). It would be interesting to document these quantitative differences, and relate them to liquidity, or to the size of the bid-ask spread.

Finally, it would be very interesting to know whether the bare response function levels off to a finite value for large time lags; this will require going beyond the analysis of the present paper and dealing with overnight effects to enlarge the available range of $\ell$ values. However, it seems reasonable to expect that $G_0(\ell)$ should indeed reach a finite asymptotic value for values of $\ell$ corresponding to a few days of trading. 8

4 Critical balance of opposite forces: Market 
orders vs. limit orders 

Although trading occurs for a large variety of reasons, it is useful to recognize 
that traders organize in two broad categories: 

• One is that of 'liquidity takers', who trigger trades by putting in market orders. The motivation for this category of traders might be to take advantage of some 'information', and make a profit from correctly anticipating future price changes. Information can in fact be of very different nature: fundamental (firm based), macro-economical, political, statistical (based on regularities of price patterns), etc. Unfortunately, information is often hard to interpret correctly, and it is probable that many of these 'information' driven trades are misguided (on this point, see the remarkable work of Odean, see also [7] and refs. therein). For example, systematic hedge funds which take decisions based on statistical pattern recognition have a typical success rate of only 52%. There is no compelling reason to believe that the intuition of traders in market rooms fares much better than that. Since market orders allow one to be immediately executed, many impatient investors, who want to liquidate their position, or hedge, etc. might be tempted to place market orders, even at the expense of the bid-ask spread $s(t) = a(t) - b(t)$.

• The other category is that of 'liquidity providers' (or 'market makers', although on electronic markets all participants can act as liquidity providers by putting in limit orders), who offer to buy or to sell but avoid taking any bare position on the market. Their profit comes from the bid-ask spread $s$: the sell price is always slightly larger than the buy price, so that each round turn operation leads to a profit equal to the spread $s$, at least if the midpoint has not changed in the mean time (see below).

8 Hopman quotes three days as the time beyond which the autocorrelation of the trades sign falls to zero 0.

This is where the game becomes interesting. Assume that a liquidity taker wants to buy, so that an increased number of buy orders arrive on the market. The liquidity provider is tempted to increase the offer (or ask) price $a$ because the buyer might be informed and really know that the current price is too low and that it will most probably increase in the near future. Should this happen, the liquidity provider, who has to close his position later, might have to buy back at a much higher price and experience a loss. In order not to trigger a sudden increase of $a$ that would make their trade costly, liquidity takers obviously need to avoid placing orders that are too large. This is the rationale for dividing one's order in small chunks and dispersing these as much as possible over time so as not to appear on the 'radar screens'. Doing so, liquidity takers necessarily create some temporal correlations in the sign of the trades. Since these traders probably have a somewhat broad spectrum of volumes to trade [SB], and therefore of trading horizons (from a few minutes to several weeks), this can easily explain the slow, power-law decay of the sign correlation function $C_0(\ell)$ reported above.

Now, if the market orders in fact do not contain useful information but are the result of hedging, noise trading, misguided interpretations, errors, etc., then the price should not move up in the long run, and should eventually mean revert to its previous value. Liquidity providers are obviously the active force behind this mean reversion, again because closing their position will be costly if the price has moved up too far from the initial price. More precisely, a computation of the liquidity provider average gain per share $\mathcal{G}$ can be performed jTH], and is found to be, for trades of volume $V$:

$$\mathcal{G} = s + \mathcal{R}(0, V) - \mathcal{R}(\infty, V) \approx s + \ln V \left[ \mathcal{R}(0) - \mathcal{R}(\infty) \right], \tag{26}$$

where $\mathcal{R}(0, V)$ is the immediate average impact of a trade, before new limit orders set in. We have in fact checked empirically that $\mathcal{R}(0, V) \approx \mathcal{R}(1, V)$. From the above formula, one sees that it is in the interest of liquidity providers to mean revert the price, so as to make $\mathcal{R}(\infty)$ as small as possible. However, this mean reversion cannot take place too quickly, again because a really informed trader would then be able to buy a large volume at a modest price. Hence, this mean reversion must be slow. From the quantitative analysis of Section 3, we have found that there is hardly any mean reversion at all on short time scales $\ell < \ell_0$, and that this effect can be described as a slow power-law for larger $\ell$'s. Actually, the action of liquidity providers and liquidity takers must be such that no (or very little) linear correlation is created in the price changes, otherwise statistical arbitrage opportunities would be created at the detriment of one or the other population.

To summarize: liquidity takers must dilute their orders and create long range correlations in the trade signs, whereas liquidity providers must correctly handle the fact that liquidity takers might either possess useful information (a rare situation, but one that can be very costly since the price can jump as a result of some significant news), or might not be informed at all and trade randomly. By slowly mean reverting the price, market makers minimize the probability that they either sell too low, or have to buy back too high. The delicate balance between these conflicting tendencies conspires to put the market at the border between persistence (if mean reversion is too weak, i.e. $\beta < \beta_c$) and anti-persistence (if mean reversion is too strong, i.e. $\beta > \beta_c$), and therefore to eliminate arbitrage opportunities.

It is actually enlightening to propose a simple model that could explain how market makers enforce this mean reversion. 9 Assume that upon placing limit orders, there is a systematic bias toward some moving average of past prices. If this average is for simplicity taken to be an exponential moving average, the continuous time description of this will read:

$$\frac{dp_t}{dt} = -\Omega\, (p_t - \bar{p}_t) + \eta_t, \qquad \frac{d\bar{p}_t}{dt} = \kappa\, (p_t - \bar{p}_t), \tag{27}$$

where $\eta_t$ is the random driving force due to trading, $\Omega$ the inverse time scale for the strength of the mean reversion, and $1/\kappa$ the 'memory' time over which the average price $\bar{p}_t$ is computed. The first equation means that liquidity providers tend to mean revert the price toward $\bar{p}_t$, while the second describes the update of the exponential moving average $\bar{p}_t$ with time. This set of linear equations can be solved, and leads to a solution of the form $p_t = \int^t dt'\, G_0(t - t')\, \eta_{t'}$, with a bare propagator given by:

$$G_0(t) = (1 - G_\infty) \exp[-(\Omega + \kappa)\, t] + G_\infty, \tag{28}$$

i.e. an exponential decay toward a finite asymptotic value $G_\infty = \kappa/(\Omega + \kappa)$. Note that, interestingly, it is the self-referential effect that leads to a non zero asymptotic impact. If the fundamental price was known to all, $\kappa = 0$ and $G_\infty = 0$. In the opposite limit where $\kappa \gg \Omega$, the last price is taken as the reference price, and $G_\infty \to 1$. A way to make $G_0(t)$ resemble a power-law is to assume that different market makers use different time horizons to compute a reasonable reference price. This leads to a $G_0(t)$ which can be written as a sum of exponentials in time with different rates, which can easily mimic a pure power-law.
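
Both statements of this paragraph can be checked numerically. The sketch below is only illustrative and uses names of our own. It first integrates the response to a unit kick, assuming the dynamics $dp/dt = -\Omega (p - \bar{p})$, $d\bar{p}/dt = \kappa (p - \bar{p})$ after the kick, and compares it with Eq. (28). It then sums exponentials with log-spaced rates and weights chosen (as an assumption) to discretize the identity $\int_0^\infty d\lambda\, \lambda^{\beta - 1} e^{-\lambda t} = \Gamma(\beta)\, t^{-\beta}$.

```python
import math


def exact_propagator(t, omega, kappa):
    """Eq. (28), with G_inf = kappa / (omega + kappa)."""
    g_inf = kappa / (omega + kappa)
    return (1.0 - g_inf) * math.exp(-(omega + kappa) * t) + g_inf


def simulated_propagator(t_end, omega, kappa, dt=1e-4):
    """Price response to a unit kick at t = 0, by explicit Euler steps.

    Assumed dynamics after the kick (no further driving):
    dp/dt = -omega * (p - p_bar),  dp_bar/dt = kappa * (p - p_bar).
    """
    price, average = 1.0, 0.0
    for _ in range(int(round(t_end / dt))):
        gap = price - average
        price -= omega * gap * dt
        average += kappa * gap * dt
    return price


def mixture_of_exponentials(t, beta, ratio=2.0, k_min=-40, k_max=10):
    """Sum of exponentials with rates ratio**k and weights (ratio**k)**beta.

    Behaves like t**(-beta) for times well inside the range of 1/rate.
    """
    rates = [ratio ** k for k in range(k_min, k_max + 1)]
    return sum(rate ** beta * math.exp(-rate * t) for rate in rates)


# Local log-log slope of the mixture between t = 10 and t = 100.
slope = math.log(mixture_of_exponentials(100.0, 0.4)
                 / mixture_of_exponentials(10.0, 0.4)) / math.log(10.0)
print(simulated_propagator(2.0, 1.0, 0.5), exact_propagator(2.0, 1.0, 0.5), slope)
```

The measured slope should be close to $-\beta$, showing that a broad, scale-free distribution of time horizons is enough to turn individual exponential relaxations into an effective power law.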

9 We have in fact directly checked on the data that the evolution of the midpoint between trades (resulting from the order flow) is indeed anticorrelated with the impact of the trades. On this point, see also where the limit order flow subsequent to a trade is studied.

The message of the above model is actually quite interesting from the point of view of Efficient Markets: it suggests that nobody really knows what the correct reference price should be, and that its best proxy is in fact its own past average over some time window (the length of which is itself distributed over several time scales).

5 Summary and Conclusion 

The aim of this paper was to study in detail the statistics of price changes at the trade by trade level, and to analyze the interplay between the impact of each trade on the price and the volatility. Empirical data show that (a) the price (midpoint) process is close to being purely diffusive, even at the trade by trade scale, (b) the impact function first increases with time and reaches a maximum after 100-1000 trades, before decreasing back, with a rather limited overall variation (typically a factor 2), and (c) the sign of the trades shows surprisingly long range (power-law) correlations. The paradox is that if the impact of each trade were permanent, the price process should be strongly super-diffusive and the average response function should increase by a large factor as a function of the time-lag.

As a possible resolution of this paradox, we have proposed a micro-model of prices, Eq. (12), where the price at any instant is the causal result of all past trades, mediated by what we called a bare impact function, or propagator $G_0$. All the empirical results can be reconciled if one assumes that this bare propagator also decays as a power-law in time, with an exponent which is precisely tuned to a critical value, ensuring simultaneously that prices are diffusive on long time scales and that the response function is nearly constant. Therefore, the seemingly trivial random walk behaviour of price changes in fact results from a fine-tuned competition between two opposite effects, one leading to super-diffusion (the autocorrelation of trades) and the other leading to sub-diffusion (the decay of the bare impact function). The cancellation is however not exact: the non trivial behaviour of the average response function allows one to detect small, but systematic deviations from a purely diffusive behaviour, deviations that are hardly detectable on the price fluctuations themselves.

In financial terms, the competition is between liquidity takers, who create long range correlations by dividing their trading volume in small quantities, and liquidity providers, who tend to mean revert the price so as to optimize their gains (see Eq. (26)). The resulting absence of correlations in price changes, and therefore of arbitrage opportunities, is often postulated a priori in the economics literature, but the details of the mechanism that removes these arbitrage opportunities are rather obscure. The main message of this paper is that the random walk nature of price changes is not due to the unpredictable nature of incoming news, but appears as a dynamical consequence of the competition between antagonist market forces. In fact, the role of real (and correctly interpreted) news appears to be rather thin: we have defined a model independent indicator of the fraction of 'informed' trades, as the asymmetry of the probability distribution of the signed price variation, where the sign is that of the trade at the initial time. Information triggered trades should show up as a detectable positive skew of this distribution, in particular in the tails. Consistently with other studies [35], our empirical results only show very weak asymmetry, barely sufficient to cover trading costs, which means that only a small fraction of trades can a posteriori be described as truly informed, whereas most trades can be classified as noise. This result is most probably one of the mechanisms needed to explain the excess volatility puzzle first raised by Schiller.

From a more general standpoint, our finding that the absence of arbitrage opportunities results from a critical balance between antagonist effects is quite interesting. It might justify several claims made in the (econo-)physics literature that the anomalies in price statistics (fat tails in returns described by power laws [2011^], long range self similar volatility correlations |2*H I25|, and the long ranged correlations in signs reported here and in [12]) are due to the presence of a critical point in the vicinity of which the market operates (see e.g. [39], and in the context of financial markets [J01E]). If a fine-tuned balance between two competing effects is needed to ensure absence of arbitrage opportunities, one should expect that fluctuations are crucial, since a local unbalance between the competing forces can lead to an instability. In this respect, the analogy with the balancing of a long stick is quite enticing JH]. In more financial terms, the breakdown of the conditions for this dynamical equilibrium is, for example, a liquidity crisis: a sudden cooperativity of market orders, which leads to an increase of the trade sign correlation function, can outweigh the liquidity providers' stabilizing (mean-reverting) role, and lead to crashes. This suggests that one should be able to write a mathematical model, inspired by our results, to describe this 'on-off intermittency' scenario, advocated (although in a different context) in [T§1 1^2*1 H3~].

Acknowledgments: Yuval Gefen thanks Science & Finance/Capital Fund Management for hospitality during the period this work was completed. We thank Jelle Boersma, Lisa Borland and Bernd Rosenow for inspiring remarks, and J. Doyne Farmer for communicating his results on the autocorrelation of trades before publication and for insightful discussions and comments. We also thank S. Picozzi for an interesting discussion and for pointing out the possible relevance of ref. to financial markets. Xavier Gabaix has made some crucial remarks on the first version of the manuscript, that in particular led to the material contained in the Appendix, and drew our attention to ref. [7]. Finally, we take the opportunity of this paper to acknowledge the countless efforts of Gene Stanley to investigate the dynamics of complex systems and to bring together different fields and ideas - as testified by the papers cited in reference, which inspired the present work.






Appendix: The case of a strictly diffusive process 



This appendix was inspired by a remark of Xavier Gabaix. There is one particular case of our micro-model of prices, Eq. (12), where prices are purely diffusive at all times (rather than only asymptotically). This is the case provided a specific relation between the bare propagator $G_0$ and the sign correlation function $C_2(\ell)$ holds. In order to show this, let us assume that the random variable $q_n = \varepsilon_n \ln V_n$ can be written as:

$$q_n = \sum_{m \le n} K(n - m)\, \xi_m, \tag{29}$$

where the $\xi_n$ are uncorrelated random variables ($\langle \xi_n \xi_m \rangle = \langle \ln^2 V \rangle\, \delta_{n,m}$), and $K(.)$ a certain kernel. In order for the $q_n$ to have the required correlations, the kernel $K(.)$ should obey the following equation:

$$C_2(n) = \langle \ln^2 V \rangle \sum_{m \ge 0} K(m + n)\, K(m). \tag{30}$$

In the case where $C_2$ decays as $\ell^{-\gamma}$ with $0 < \gamma < 1$, it is easy to show that the asymptotic decay of $K(n)$ should also be a power-law $n^{-\delta}$ with $2\delta - 1 = \gamma$. Note that $1/2 < \delta < 1$.

Inverting Eq. (29) allows one to obtain a set of uncorrelated random variables $\xi_n$ from a set of correlated variables $q_n$:

$$\xi_n = \sum_{m \le n} Q(n - m)\, q_m, \tag{31}$$

where $Q$ is the matrix inverse of $K$, such that $\sum_{m=0}^{n} K(n - m)\, Q(m) = \delta_{n,0}$. Eqs. (29, 31) in fact form the basis of linear filter theories, and $\xi_n$ can be seen as the prediction error on the next variable $q_n$. Introducing discrete Laplace transforms:

$$\hat{K}(E) = \sum_{n \ge 0} K(n)\, e^{-nE}, \qquad \hat{Q}(E) = \sum_{n \ge 0} Q(n)\, e^{-nE}, \tag{32}$$

one finds $\hat{K}(E)\, \hat{Q}(E) = 1$. For a power-law kernel $K(.)$, one obtains $\hat{Q}(E) \propto E^{1 - \delta}$ for $E \to 0$, and therefore $Q(n) \propto n^{\delta - 2}$ for large $n$. It is useful to note that in this case $\hat{Q}(E = 0) = \sum_{n \ge 0} Q(n) = 0$.

Now, it is clear that if one defines the price process $p_n$ as:

$$p_n = \sum_{m < n} \xi_m, \tag{33}$$

then $p_n$ is a diffusion process with a strictly linear $\mathcal{D}(\ell)$, since the $\xi$'s are by construction uncorrelated. The price defined in this way can also be written, using Eq. (31), as a linear combination of past $q_m$'s, as assumed in our micro-model Eq. (12), with:

$$G_0^*(\ell) = \sum_{m=0}^{\ell - 1} Q(m). \tag{34}$$

This is an exact relation between $C_2$ (that allows one to compute in turn $K$ and $Q$) and the response function $G_0^*$ for all $\ell$'s, where the star indicates that strict diffusion is imposed.
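
Eqs. (31)-(34) translate into a short recursion. The sketch below, with names of our own, assumes that the kernel $K$ is known on a finite range $n = 0, \dots, N-1$ with $K(0) \neq 0$. It inverts the causal convolution term by term, so that $\sum_{m=0}^{n} K(n-m) Q(m)$ equals 1 for $n = 0$ and 0 otherwise, and then accumulates $Q$ with the convention $G_0^*(\ell) = \sum_{m=0}^{\ell-1} Q(m)$, which is our reading of Eqs. (33)-(34). The power-law kernel used in the example is a placeholder, not the kernel fitted to the data.

```python
def inverse_kernel(kernel):
    """Return Q with sum_{m=0}^{n} K(n-m) Q(m) = 1 if n == 0 else 0."""
    q = [1.0 / kernel[0]]
    for n in range(1, len(kernel)):
        # Contribution of the already known Q(0..n-1) to the n-th convolution.
        known = sum(kernel[n - m] * q[m] for m in range(n))
        q.append(-known / kernel[0])
    return q


def diffusive_propagator(q):
    """Cumulative sums of Q: element ell-1 holds G_0^*(ell), ell = 1..len(q)."""
    propagator, running = [], 0.0
    for value in q:
        running += value
        propagator.append(running)
    return propagator


# Illustrative power-law kernel with delta = (1 + gamma) / 2 and gamma = 0.24.
delta = 0.62
kernel = [(n + 1.0) ** (-delta) for n in range(200)]
q = inverse_kernel(kernel)
g_star = diffusive_propagator(q)
print(g_star[0], g_star[9], g_star[99])
```

A simple check is the geometric kernel $K(n) = a^n$, for which the recursion returns $Q = (1, -a, 0, 0, \dots)$ and hence a propagator that drops from 1 to $1 - a$ after one trade and stays there.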

In the case of power-law kernels, one finds from the above relation and from $Q(n) \propto n^{\delta - 2}$ for large $n$:

$$G_0^*(\ell) \propto \ell^{\delta - 1} \quad \longrightarrow \quad \beta = 1 - \delta = \frac{1 - \gamma}{2}, \tag{35}$$

which is, not surprisingly, the relation obtained in the main text from the assumption that prices are diffusive on long time scales.

Eq. (34) can be used to construct $G_0^*$ from the empirical determination of $C_2$, shown in Fig. 10. In order to obtain this curve, we have fitted $C_2(n)$ as:

$$C_2(0) = 33.5; \qquad C_2(n) = \frac{7.16}{\_\_\_} \tag{36}$$

and used the Levinson-Durbin recursion algorithm for solving a Toeplitz system (see, e.g., HU).
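
For reference, a textbook version of the Levinson-Durbin recursion is sketched below; it is not the code used for Fig. 10. Given autocovariances $r(0), \dots, r(p)$, it solves the Toeplitz (Yule-Walker) system $\sum_{j=1}^{p} a_j\, r(|i - j|) = r(i)$, $i = 1, \dots, p$, for the coefficients of the best linear predictor of order $p$, whose prediction error plays the role of $\xi_n$. How the output maps onto $Q$ depends on normalization conventions that are not spelled out here, and the autocovariance used in the example is a placeholder, not the fit of Eq. (36).

```python
def levinson_durbin(autocov, order):
    """Solve the Yule-Walker equations for a linear predictor of given order.

    autocov[k] is the autocovariance at lag k, for k = 0..order.
    Returns (a_1..a_order, final prediction-error variance).
    """
    coeffs = [0.0] * (order + 1)  # coeffs[0] is unused
    error = autocov[0]
    for i in range(1, order + 1):
        # Reflection coefficient for going from order i-1 to order i.
        partial = autocov[i] - sum(coeffs[j] * autocov[i - j] for j in range(1, i))
        reflection = partial / error
        updated = coeffs[:]
        updated[i] = reflection
        for j in range(1, i):
            updated[j] = coeffs[j] - reflection * coeffs[i - j]
        coeffs = updated
        error *= 1.0 - reflection * reflection
    return coeffs[1:], error


# Placeholder autocovariance with a slow power-law decay.
order = 10
autocov = [1.0] + [0.2 * n ** (-0.24) for n in range(1, order + 1)]
predictor, error_variance = levinson_durbin(autocov, order)
print(predictor[:3], error_variance)
```

For an exactly geometric autocovariance $r(k) = \phi^k$ the recursion returns $a_1 = \phi$ and all higher coefficients equal to zero, which is a convenient sanity check.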
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